<!DOCTYPE html>
<html lang="en">
    <!--
      Licensed to the Apache Software Foundation (ASF) under one or more
      contributor license agreements.  See the NOTICE file distributed with
      this work for additional information regarding copyright ownership.
      The ASF licenses this file to You under the Apache License, Version 2.0
      (the "License"); you may not use this file except in compliance with
      the License.  You may obtain a copy of the License at
          http://www.apache.org/licenses/LICENSE-2.0
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License.
    -->
    <head>
        <meta charset="utf-8" />
        <title>CreateHadoopSequenceFile</title>

        <link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css" />
    </head>

    <body>
        <!-- Processor Documentation ================================================== -->
        <h2>Description:</h2>
        <p>This processor creates a Hadoop SequenceFile, which is essentially a file of key/value pairs. Each key 
            is a file name and each value is the corresponding FlowFile content. The processor accepts either a merged (a.k.a. packaged) 
            FlowFile or a single FlowFile. Historically, this processor merged files by type and by size or time before creating the 
            SequenceFile output; it no longer does this. To create a SequenceFile that contains multiple files of the same type,
            precede this processor with a <code>RouteOnAttribute</code> processor to segregate files of the same type, and follow that with a
            <code>MergeContent</code> processor to bundle the files. If the type of the files is not important, use only the 
            <code>MergeContent</code> processor. When using the <code>MergeContent</code> processor, this processor supports the 
            following Merge Formats:
        </p>
        <ul>
            <li>TAR</li>
            <li>ZIP</li>
            <li>FlowFileStream v3</li>
        </ul>
        <p>The created SequenceFile has the same name as the incoming FlowFile, with the suffix '.sf' appended. For incoming FlowFiles that are 
            bundled, the keys in the SequenceFile are the individual file names and the values are the contents of each file.
        </p>
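        <p>The key/value mapping and the '.sf' naming rule described above can be illustrated with a minimal sketch. It uses only the 
            Java standard library and a ZIP bundle. It does not write the real SequenceFile format and is not the processor's 
            implementation; the class <code>SequenceFilePairsSketch</code> and its methods are hypothetical names introduced for illustration.
        </p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only: not part of NiFi or Hadoop.
class SequenceFilePairsSketch {

    // Output name: the incoming FlowFile name with the ".sf" suffix appended.
    static String outputName(String flowFileName) {
        return flowFileName + ".sf";
    }

    // One pair per bundled file: key = entry name, value = entry content.
    // Each value is held fully in memory, as the documentation notes.
    static Map<String, byte[]> pairsFromZip(byte[] zipBytes) throws IOException {
        Map<String, byte[]> pairs = new LinkedHashMap<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            byte[] chunk = new byte[8192];
            while ((entry = in.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream value = new ByteArrayOutputStream();
                int read;
                // read() returns -1 at the end of the current entry.
                while ((read = in.read(chunk)) != -1) {
                    value.write(chunk, 0, read);
                }
                pairs.put(entry.getName(), value.toByteArray());
            }
        }
        return pairs;
    }

    // Builds a small in-memory ZIP so the sketch runs without external files.
    static byte[] zipOf(Map<String, String> files) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                out.putNextEntry(new ZipEntry(file.getKey()));
                out.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("a.txt", "alpha");
        files.put("b.txt", "beta");

        Map<String, byte[]> pairs = pairsFromZip(zipOf(files));
        System.out.println("output name: " + outputName("bundle.zip"));
        for (Map.Entry<String, byte[]> pair : pairs.entrySet()) {
            System.out.println(pair.getKey() + " -> " + pair.getValue().length + " bytes");
        }
    }
}
```

        <p>For a single, unbundled FlowFile the same idea reduces to one pair: the FlowFile's file name as the key and its content as the value.</p>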
        <p>NOTE: The value portion of each key/value pair is loaded into memory. Although each value is limited to a maximum size of 2 GB,
            memory problems can still occur if many concurrent tasks process large FlowFiles at the same time.
        </p>
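        <p>A rough way to reason about the memory note is to multiply the number of concurrent tasks by the largest expected value size. 
            The sketch below assumes that each concurrent task holds at most one value in memory at a time and ignores all other heap usage, 
            so it is only a back-of-the-envelope bound. The class <code>ValueMemoryEstimate</code> and its methods are hypothetical and are 
            not part of NiFi.
        </p>

```java
// Hypothetical illustration only: a back-of-the-envelope heap check.
class ValueMemoryEstimate {

    // Assumed worst case: every concurrent task holds one value of the largest size.
    static long worstCaseValueBytes(int concurrentTasks, long largestValueBytes) {
        return Math.multiplyExact((long) concurrentTasks, largestValueBytes);
    }

    // True when the assumed worst case fits within the given heap size.
    static boolean fitsInHeap(int concurrentTasks, long largestValueBytes, long heapBytes) {
        return worstCaseValueBytes(concurrentTasks, largestValueBytes) <= heapBytes;
    }

    public static void main(String[] args) {
        int concurrentTasks = 4;                       // example setting, an assumption
        long largestValueBytes = 512L * 1024 * 1024;   // example: 512 MiB values
        long heapBytes = Runtime.getRuntime().maxMemory();

        System.out.println("assumed worst case: "
                + worstCaseValueBytes(concurrentTasks, largestValueBytes) + " bytes");
        System.out.println("fits in current JVM heap: "
                + fitsInHeap(concurrentTasks, largestValueBytes, heapBytes));
    }
}
```

        <p>If the estimate exceeds the available heap, reducing the number of concurrent tasks or bundling smaller files are the two 
            levers that this note points to.
        </p>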
</body>
</html>
